Rank | Count | Beginning |
---|---|---|
69955 | 11241 | The |
26121 | 2578 | He |
31428 | 2508 | I |
35371 | 2370 | In |
39202 | 2209 | It |
12046 | 1756 | But |
85922 | 1627 | This |
407 | 1585 | A |
91731 | 1479 | We |
63160 | 1214 | She |
84191 | 1213 | They |
4553 | 1106 | And |
32578 | 1017 | If |
7197 | 905 | AS |
22712 | 845 | For |
79424 | 836 | There |
31429 | 773 | “I |
98841 | 644 | You |
95587 | 631 | When |
91756 | 620 | “We |
54653 | 610 | On |
68873 | 608 | That |
1869 | 540 | After |
70035 | 538 | “The |
8556 | 504 | At |
41223 | 499 | It’s |
54911 | 478 | One |
65393 | 475 | So |
97592 | 471 | With |
96567 | 453 | While |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
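
The same count can be sketched outside the database. The Python below is a minimal illustration, not the tooling used to produce the table: the function name `sentence_beginnings` and the sample sentences are hypothetical, and it assumes the sentences are already segmented and that words are separated by single spaces, mirroring `substring_index(sentence, ' ', N)`.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Count the first n space-separated tokens of each sentence.

    Mirrors substring_index(sentence, ' ', n) for positive n: the text
    before the n-th space, or the whole sentence if it has fewer spaces.
    Returns up to `limit` (beginning, count) pairs, most frequent first.
    """
    counts = Counter(
        " ".join(sentence.split(" ")[:n])  # keep everything before the n-th space
        for sentence in sentences
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        "The cat sat.",
        "The dog ran.",
        "He left.",
        "The end.",
        "He came back.",
    ]
    for beginning, count in sentence_beginnings(sample, n=1):
        print(count, beginning)
```

Raising `n` to 2, 3, or 4 gives the longer beginnings covered in the following subsections. As in the SQL query, punctuation attached to a word (for example an opening quotation mark) is counted as part of the beginning.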